Overview

Dataset statistics

Number of variables: 46
Number of observations: 131
Missing cells: 217
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 47.2 KiB
Average record size in memory: 369.0 B
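
The derived figures in this block follow from the raw counts. The sketch below recomputes them from the numbers reported above; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables × observations), which matches the reported 3.6%.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 46
n_observations = 131
missing_cells = 217
total_size_kib = 47.2

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size = total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```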

Variable types

DateTime: 5
Numeric: 11
Categorical: 29
Unsupported: 1

Warnings

num_ide_ has a high cardinality: 131 distinct values (High cardinality)
no_ide_rn has a high cardinality: 131 distinct values (High cardinality)
semana is highly correlated with nreg (High correlation)
nacionali_ is highly correlated with gp_migrant (High correlation)
cod_dpto_o is highly correlated with cod_dpto_r (High correlation)
cod_mun_o is highly correlated with cod_mun_r (High correlation)
gp_migrant is highly correlated with nacionali_ (High correlation)
gp_gestan is highly correlated with gp_otros (High correlation)
sem_ges_ is highly correlated with edad_rn and 1 other field (High correlation)
gp_otros is highly correlated with gp_gestan (High correlation)
cod_dpto_r is highly correlated with cod_dpto_o (High correlation)
cod_mun_r is highly correlated with cod_mun_o (High correlation)
edad_rn is highly correlated with sem_ges_ (High correlation)
sem_gest is highly correlated with sem_ges_ (High correlation)
num_em_pre is highly correlated with num_hi_viv (High correlation)
num_hi_viv is highly correlated with num_em_pre (High correlation)
nreg is highly correlated with semana (High correlation)
cod_pre is highly correlated with nit_upgd (High correlation)
ocupacion_ is highly correlated with nit_upgd (High correlation)
sem_ges_ is highly correlated with sem_gest (High correlation)
nit_upgd is highly correlated with cod_pre and 1 other field (High correlation)
cod_ase_ is highly correlated with nit_upgd and 11 other fields (High correlation)
ndep_proce is highly correlated with cod_mun_r and 9 other fields (High correlation)
gp_migrant is highly correlated with nacionali_ and 8 other fields (High correlation)
cod_mun_r is highly correlated with ndep_proce and 9 other fields (High correlation)
nacionali_ is highly correlated with gp_migrant and 8 other fields (High correlation)
niv_edu_ma is highly correlated with nit_upgd and 3 other fields (High correlation)
edad_ is highly correlated with num_hi_viv and 4 other fields (High correlation)
cod_dpto_r is highly correlated with ndep_proce and 8 other fields (High correlation)
nit_upgd is highly correlated with cod_ase_ and 15 other fields (High correlation)
ndep_resi is highly correlated with ndep_proce and 8 other fields (High correlation)
pac_hos_ is highly correlated with cod_ase_ and 2 other fields (High correlation)
gp_gestan is highly correlated with cod_ase_ and 7 other fields (High correlation)
version is highly correlated with tip_doc_rn and 7 other fields (High correlation)
cod_mun_o is highly correlated with ndep_proce and 11 other fields (High correlation)
tip_doc_rn is highly correlated with cod_ase_ and 7 other fields (High correlation)
tip_ss_ is highly correlated with cod_ase_ and 8 other fields (High correlation)
num_em_pre is highly correlated with fec_aju_ and 2 other fields (High correlation)
fec_aju_ is highly correlated with ndep_proce and 29 other fields (High correlation)
area_ is highly correlated with cod_mun_r and 6 other fields (High correlation)
sem_ges_ is highly correlated with cod_ase_ and 11 other fields (High correlation)
talla_nace is highly correlated with cod_mun_o and 8 other fields (High correlation)
num_hi_viv is highly correlated with edad_ and 5 other fields (High correlation)
sem_gest is highly correlated with gp_migrant and 7 other fields (High correlation)
gp_otros is highly correlated with cod_ase_ and 7 other fields (High correlation)
nmun_proce is highly correlated with cod_ase_ and 13 other fields (High correlation)
peso_nacer is highly correlated with nmun_proce and 1 other field (High correlation)
nmun_resi is highly correlated with cod_ase_ and 12 other fields (High correlation)
mult_embar is highly correlated with fec_aju_ and 2 other fields (High correlation)
semana is highly correlated with version and 5 other fields (High correlation)
nom_upgd is highly correlated with cod_ase_ and 10 other fields (High correlation)
fec_not is highly correlated with cod_ase_ and 28 other fields (High correlation)
fecha_nac is highly correlated with ndep_proce and 29 other fields (High correlation)
nombre_nacionalidad is highly correlated with gp_migrant and 8 other fields (High correlation)
tip_ide_ is highly correlated with gp_migrant and 6 other fields (High correlation)
ocupacion_ is highly correlated with fec_not and 2 other fields (High correlation)
cod_dpto_o is highly correlated with ndep_proce and 9 other fields (High correlation)
estrato_ is highly correlated with gp_migrant and 10 other fields (High correlation)
edad_rn is highly correlated with version and 5 other fields (High correlation)
fec_con_ is highly correlated with edad_ and 21 other fields (High correlation)
sexo is highly correlated with fecha_nac and 1 other field (High correlation)
nreg is highly correlated with version and 5 other fields (High correlation)
cod_pre is highly correlated with cod_ase_ and 3 other fields (High correlation)
cod_ase_ is highly correlated with gp_migrant and 7 other fields (High correlation)
ndep_proce is highly correlated with cod_dpto_r and 5 other fields (High correlation)
gp_migrant is highly correlated with cod_ase_ and 5 other fields (High correlation)
nacionali_ is highly correlated with cod_ase_ and 5 other fields (High correlation)
cod_dpto_r is highly correlated with ndep_proce and 5 other fields (High correlation)
nit_upgd is highly correlated with cod_ase_ and 4 other fields (High correlation)
ndep_resi is highly correlated with ndep_proce and 5 other fields (High correlation)
pac_hos_ is highly correlated with sem_ges_ (High correlation)
gp_gestan is highly correlated with nit_upgd and 3 other fields (High correlation)
tip_doc_rn is highly correlated with cod_ase_ and 3 other fields (High correlation)
tip_ss_ is highly correlated with cod_ase_ and 4 other fields (High correlation)
area_ is highly correlated with nmun_proce and 1 other field (High correlation)
sem_ges_ is highly correlated with ndep_proce and 7 other fields (High correlation)
gp_otros is highly correlated with cod_ase_ and 4 other fields (High correlation)
nmun_proce is highly correlated with ndep_proce and 5 other fields (High correlation)
nmun_resi is highly correlated with ndep_proce and 5 other fields (High correlation)
nom_upgd is highly correlated with cod_ase_ and 7 other fields (High correlation)
tip_ide_ is highly correlated with gp_migrant and 3 other fields (High correlation)
nombre_nacionalidad is highly correlated with cod_ase_ and 5 other fields (High correlation)
cod_dpto_o is highly correlated with ndep_proce and 5 other fields (High correlation)
estrato_ is highly correlated with tip_doc_rn (High correlation)
cod_ase_ has 23 (17.6%) missing values (Missing)
estrato_ has 67 (51.1%) missing values (Missing)
sem_ges_ has 107 (81.7%) missing values (Missing)
nit_upgd has 9 (6.9%) missing values (Missing)
niv_edu_ma has 2 (1.5%) missing values (Missing)
nom_upgd has 9 (6.9%) missing values (Missing)
num_ide_ is uniformly distributed (Uniform)
no_ide_rn is uniformly distributed (Uniform)
nreg is uniformly distributed (Uniform)
num_ide_ has unique values (Unique)
no_ide_rn has unique values (Unique)
nreg has unique values (Unique)
fec_hos_ is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
edad_rn has 5 (3.8%) zeros (Zeros)
num_em_pre has 62 (47.3%) zeros (Zeros)

Reproduction

Analysis started: 2021-07-10 22:40:54.191591
Analysis finished: 2021-07-10 22:46:40.725209
Duration: 5 minutes and 46.53 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
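
The reported duration is simply the difference between the two timestamps. A minimal sketch using only the Python standard library, with the timestamps copied from this block and illustrative variable names:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2021-07-10 22:40:54.191591")
finished = datetime.fromisoformat("2021-07-10 22:46:40.725209")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```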

Variables

fec_not
Date

HIGH CORRELATION

Distinct: 79
Distinct (%): 60.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
Minimum: 2021-01-06 00:00:00
Maximum: 2021-06-21 00:00:00
Histogram with fixed size bins (bins=50)

semana
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 23
Distinct (%): 17.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.16030534
Minimum: 1
Maximum: 23
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
Median: 11
Q3: 16
95-th percentile: 21.5
Maximum: 23
Range: 22
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 6.141682047
Coefficient of variation (CV): 0.5503148756
Kurtosis: -1.052039502
Mean: 11.16030534
Median Absolute Deviation (MAD): 5
Skewness: 0.07958469032
Sum: 1462
Variance: 37.72025837
Monotonicity: Not monotonic
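
Several of the statistics reported for semana are derived from the others. The sketch below, with illustrative variable names, recomputes range, IQR, coefficient of variation, variance and sum from the reported quantiles, mean and standard deviation, using the usual definitions (CV = standard deviation / mean, variance = standard deviation squared).

```python
# Values copied from the semana statistics in this report.
n = 131                      # number of observations
mean = 11.16030534
std = 6.141682047
minimum, q1, q3, maximum = 1, 5, 16, 23

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr)
print(f"CV={cv:.4f} variance={variance:.4f} sum={total:.0f}")
```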
Histogram with fixed size bins (bins=23)
Value              Count  Frequency (%)
5                  10     7.6%
8                  9      6.9%
10                 9      6.9%
12                 8      6.1%
14                 8      6.1%
4                  8      6.1%
17                 8      6.1%
11                 8      6.1%
16                 7      5.3%
2                  7      5.3%
Other values (13)  49     37.4%

Value  Count  Frequency (%)
1      5      3.8%
2      7      5.3%
3      4      3.1%
4      8      6.1%
5      10     7.6%
6      2      1.5%
7      4      3.1%
8      9      6.9%
9      3      2.3%
10     9      6.9%

Value  Count  Frequency (%)
23     1      0.8%
22     6      4.6%
21     4      3.1%
20     3      2.3%
19     4      3.1%
18     4      3.1%
17     8      6.1%
16     7      5.3%
15     6      4.6%
14     8      6.1%

cod_pre
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6801990104
Minimum: 6800100431
Maximum: 6827601666
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 6800100431
5-th percentile: 6800100431
Q1: 6800100431
Median: 6800100792
Q3: 6800101157
95-th percentile: 6827601192
Maximum: 6827601666
Range: 27501235
Interquartile range (IQR): 726

Descriptive statistics

Standard deviation: 6982918.869
Coefficient of variation (CV): 0.001026599387
Kurtosis: 10.05445947
Mean: 6801990104
Median Absolute Deviation (MAD): 361
Skewness: 3.449807693
Sum: 8.910607036 × 10^11
Variance: 4.876115593 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value       Count  Frequency (%)
6800100792  43     32.8%
6800100431  40     30.5%
6800101157  27     20.6%
6800100701  12     9.2%
6827601666  7      5.3%
6827600717  1      0.8%
6827600289  1      0.8%

Value       Count  Frequency (%)
6800100431  40     30.5%
6800100701  12     9.2%
6800100792  43     32.8%
6800101157  27     20.6%
6827600289  1      0.8%
6827600717  1      0.8%
6827601666  7      5.3%

Value       Count  Frequency (%)
6827601666  7      5.3%
6827600717  1      0.8%
6827600289  1      0.8%
6800101157  27     20.6%
6800100792  43     32.8%
6800100701  12     9.2%
6800100431  40     30.5%

tip_ide_
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
CC     96
CE     14
TI     12
MS     6
AS     2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 262
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%
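
The report lists both "Distinct" and "Unique" for this variable. The figures are consistent with Distinct counting the different values and Unique counting the values that occur exactly once; that reading is an assumption, checked in the sketch below against the tip_ide_ counts from the Common Values table (variable names are illustrative).

```python
from collections import Counter

# Category counts copied from the tip_ide_ "Common Values" table.
counts = Counter({"CC": 96, "CE": 14, "TI": 12, "MS": 6, "AS": 2, "PE": 1})

n_rows = sum(counts.values())
distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(f"rows={n_rows}, distinct={distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"unique={unique} ({100 * unique / n_rows:.1f}%)")
```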

Sample

1st row: CC
2nd row: CC
3rd row: CC
4th row: CC
5th row: CC

Common Values

Value  Count  Frequency (%)
CC     96     73.3%
CE     14     10.7%
TI     12     9.2%
MS     6      4.6%
AS     2      1.5%
PE     1      0.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
cc     96     73.3%
ce     14     10.7%
ti     12     9.2%
ms     6      4.6%
as     2      1.5%
pe     1      0.8%

Most occurring characters

Value  Count  Frequency (%)
C      206    78.6%
E      15     5.7%
T      12     4.6%
I      12     4.6%
S      8      3.1%
M      6      2.3%
A      2      0.8%
P      1      0.4%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  262    100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
C      206    78.6%
E      15     5.7%
T      12     4.6%
I      12     4.6%
S      8      3.1%
M      6      2.3%
A      2      0.8%
P      1      0.4%

Most occurring scripts

Value  Count  Frequency (%)
Latin  262    100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
C      206    78.6%
E      15     5.7%
T      12     4.6%
I      12     4.6%
S      8      3.1%
M      6      2.3%
A      2      0.8%
P      1      0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  262    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
C      206    78.6%
E      15     5.7%
T      12     4.6%
I      12     4.6%
S      8      3.1%
M      6      2.3%
A      2      0.8%
P      1      0.4%

num_ide_
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 131
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value               Count
1007665960          1
1005484877          1
VEN27626337         1
1095929685          1
1101048096          1
Other values (126)  126

Length

Max length: 18
Median length: 10
Mean length: 10.09160305
Min length: 8

Characters and Unicode

Total characters: 1322
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
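
As a rough illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two input strings are made-up values that only mimic the shape of identifiers in this column (letters, digits and a dash); they are not taken from the dataset. The standard library exposes the general category (`Lu`, `Nd`, `Pd`, ...) but has no script or block lookup, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Hypothetical identifiers shaped like the values in this column.
samples = ["VEN12345678", "1234567890-1"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# Lu = Uppercase Letter, Nd = Decimal Number, Pd = Dash Punctuation
for category, count in category_counts.most_common():
    print(category, count)
```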

Unique

Unique: 131
Unique (%): 100.0%

Sample

1st row: 37372773
2nd row: 1098713904
3rd row: 1099940520
4th row: 1000251004
5th row: 1000251004-1

Common Values

Value               Count  Frequency (%)
1007665960          1      0.8%
1005484877          1      0.8%
VEN27626337         1      0.8%
1095929685          1      0.8%
1101048096          1      0.8%
1005104784          1      0.8%
1098710937          1      0.8%
37864414            1      0.8%
1005136505          1      0.8%
1193149502          1      0.8%
Other values (121)  121    92.4%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
1007665960          1      0.8%
1098749294          1      0.8%
1005104784          1      0.8%
1098710937          1      0.8%
37864414            1      0.8%
1005136505          1      0.8%
ven721454102091991  1      0.8%
1193149502          1      0.8%
1065902103          1      0.8%
1098631893          1      0.8%
Other values (121)  121    92.4%

Most occurring characters

Value             Count  Frequency (%)
0                 213    16.1%
1                 210    15.9%
9                 139    10.5%
8                 125    9.5%
2                 98     7.4%
5                 97     7.3%
3                 96     7.3%
7                 96     7.3%
6                 89     6.7%
4                 84     6.4%
Other values (4)  75     5.7%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    1247   94.3%
Uppercase Letter  69     5.2%
Dash Punctuation  6      0.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      213    17.1%
1      210    16.8%
9      139    11.1%
8      125    10.0%
2      98     7.9%
5      97     7.8%
3      96     7.7%
7      96     7.7%
6      89     7.1%
4      84     6.7%

Uppercase Letter
Value  Count  Frequency (%)
V      23     33.3%
E      23     33.3%
N      23     33.3%

Dash Punctuation
Value  Count  Frequency (%)
-      6      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1253   94.8%
Latin   69     5.2%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      213    17.0%
1      210    16.8%
9      139    11.1%
8      125    10.0%
2      98     7.8%
5      97     7.7%
3      96     7.7%
7      96     7.7%
6      89     7.1%
4      84     6.7%

Latin
Value  Count  Frequency (%)
V      23     33.3%
E      23     33.3%
N      23     33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1322   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 213    16.1%
1                 210    15.9%
9                 139    10.5%
8                 125    9.5%
2                 98     7.4%
5                 97     7.3%
3                 96     7.3%
7                 96     7.3%
6                 89     6.7%
4                 84     6.4%
Other values (4)  75     5.7%

edad_
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 29
Distinct (%): 22.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.76335878
Minimum: 15
Maximum: 47
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 15
5-th percentile: 16
Q1: 20
Median: 24
Q3: 30
95-th percentile: 38.5
Maximum: 47
Range: 32
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.362934262
Coefficient of variation (CV): 0.2857909299
Kurtosis: -0.185736991
Mean: 25.76335878
Median Absolute Deviation (MAD): 5
Skewness: 0.6570749812
Sum: 3375
Variance: 54.21280094
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Value              Count  Frequency (%)
20                 15     11.5%
29                 10     7.6%
24                 10     7.6%
23                 9      6.9%
17                 9      6.9%
25                 7      5.3%
34                 5      3.8%
26                 5      3.8%
15                 5      3.8%
21                 5      3.8%
Other values (19)  51     38.9%

Value  Count  Frequency (%)
15     5      3.8%
16     4      3.1%
17     9      6.9%
18     4      3.1%
19     3      2.3%
20     15     11.5%
21     5      3.8%
22     4      3.1%
23     9      6.9%
24     10     7.6%

Value  Count  Frequency (%)
47     1      0.8%
45     2      1.5%
42     1      0.8%
40     1      0.8%
39     2      1.5%
38     2      1.5%
37     4      3.1%
36     4      3.1%
35     2      1.5%
34     5      3.8%

nacionali_
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
170    108
862    23

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 393
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 170
2nd row: 170
3rd row: 170
4th row: 170
5th row: 170

Common Values

Value  Count  Frequency (%)
170    108    82.4%
862    23     17.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
170    108    82.4%
862    23     17.6%

Most occurring characters

Value  Count  Frequency (%)
1      108    27.5%
7      108    27.5%
0      108    27.5%
8      23     5.9%
6      23     5.9%
2      23     5.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  393    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      108    27.5%
7      108    27.5%
0      108    27.5%
8      23     5.9%
6      23     5.9%
2      23     5.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  393    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      108    27.5%
7      108    27.5%
0      108    27.5%
8      23     5.9%
6      23     5.9%
2      23     5.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  393    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      108    27.5%
7      108    27.5%
0      108    27.5%
8      23     5.9%
6      23     5.9%
2      23     5.9%

nombre_nacionalidad
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value      Count
COLOMBIA   108
VENEZUELA  23

Length

Max length: 9
Median length: 8
Mean length: 8.175572519
Min length: 8

Characters and Unicode

Total characters: 1071
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: COLOMBIA
2nd row: COLOMBIA
3rd row: COLOMBIA
4th row: COLOMBIA
5th row: COLOMBIA

Common Values

Value      Count  Frequency (%)
COLOMBIA   108    82.4%
VENEZUELA  23     17.6%

Length

Histogram of lengths of the category

Pie chart

Value      Count  Frequency (%)
colombia   108    82.4%
venezuela  23     17.6%

Most occurring characters

Value             Count  Frequency (%)
O                 216    20.2%
L                 131    12.2%
A                 131    12.2%
C                 108    10.1%
M                 108    10.1%
B                 108    10.1%
I                 108    10.1%
E                 69     6.4%
V                 23     2.1%
N                 23     2.1%
Other values (2)  46     4.3%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  1071   100.0%

Most frequent character per category

Uppercase Letter
Value             Count  Frequency (%)
O                 216    20.2%
L                 131    12.2%
A                 131    12.2%
C                 108    10.1%
M                 108    10.1%
B                 108    10.1%
I                 108    10.1%
E                 69     6.4%
V                 23     2.1%
N                 23     2.1%
Other values (2)  46     4.3%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1071   100.0%

Most frequent character per script

Latin
Value             Count  Frequency (%)
O                 216    20.2%
L                 131    12.2%
A                 131    12.2%
C                 108    10.1%
M                 108    10.1%
B                 108    10.1%
I                 108    10.1%
E                 69     6.4%
V                 23     2.1%
N                 23     2.1%
Other values (2)  46     4.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1071   100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
O                 216    20.2%
L                 131    12.2%
A                 131    12.2%
C                 108    10.1%
M                 108    10.1%
B                 108    10.1%
I                 108    10.1%
E                 69     6.4%
V                 23     2.1%
N                 23     2.1%
Other values (2)  46     4.3%

cod_dpto_o
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
68     125
20     3
13     2
54     1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 262
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 68
2nd row: 68
3rd row: 68
4th row: 68
5th row: 68

Common Values

Value  Count  Frequency (%)
68     125    95.4%
20     3      2.3%
13     2      1.5%
54     1      0.8%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
68     125    95.4%
20     3      2.3%
13     2      1.5%
54     1      0.8%

Most occurring characters

Value  Count  Frequency (%)
6      125    47.7%
8      125    47.7%
2      3      1.1%
0      3      1.1%
1      2      0.8%
3      2      0.8%
5      1      0.4%
4      1      0.4%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  262    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      125    47.7%
8      125    47.7%
2      3      1.1%
0      3      1.1%
1      2      0.8%
3      2      0.8%
5      1      0.4%
4      1      0.4%

Most occurring scripts

Value   Count  Frequency (%)
Common  262    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
6      125    47.7%
8      125    47.7%
2      3      1.1%
0      3      1.1%
1      2      0.8%
3      2      0.8%
5      1      0.4%
4      1      0.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  262    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
6      125    47.7%
8      125    47.7%
2      3      1.1%
0      3      1.1%
1      2      0.8%
3      2      0.8%
5      1      0.4%
4      1      0.4%

cod_mun_o
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 21
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 224.2366412
Minimum: 1
Maximum: 855
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 128
Q3: 368
95-th percentile: 679.5
Maximum: 855
Range: 854
Interquartile range (IQR): 367

Descriptive statistics

Standard deviation: 256.5955462
Coefficient of variation (CV): 1.144306947
Kurtosis: -0.662124686
Mean: 224.2366412
Median Absolute Deviation (MAD): 127
Skewness: 0.7642781076
Sum: 29375
Variance: 65841.27434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Value              Count  Frequency (%)
1                  60     45.8%
276                22     16.8%
547                15     11.5%
307                7      5.3%
820                4      3.1%
11                 3      2.3%
615                3      2.3%
572                2      1.5%
689                2      1.5%
670                2      1.5%
Other values (11)  11     8.4%

Value  Count  Frequency (%)
1      60     45.8%
11     3      2.3%
20     1      0.8%
79     1      0.8%
128    1      0.8%
190    1      0.8%
276    22     16.8%
298    1      0.8%
307    7      5.3%
318    1      0.8%

Value  Count  Frequency (%)
855    1      0.8%
820    4      3.1%
689    2      1.5%
670    2      1.5%
655    1      0.8%
615    3      2.3%
572    2      1.5%
547    15     11.5%
464    1      0.8%
444    1      0.8%

area_
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
1      110
3      19
2      2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 131
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

Most occurring characters

Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  131    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  131    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  131    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      110    84.0%
3      19     14.5%
2      2      1.5%

ocupacion_
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9899.198473
Minimum: 3221
Maximum: 9999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 3221
5-th percentile: 9996
Q1: 9996
Median: 9997
Q3: 9999
95-th percentile: 9999
Maximum: 9999
Range: 6778
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 764.2550865
Coefficient of variation (CV): 0.07720373408
Kurtosis: 65.19398544
Mean: 9899.198473
Median Absolute Deviation (MAD): 2
Skewness: -8.090420734
Sum: 1296795
Variance: 584085.8372
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
9999   63     48.1%
9996   57     43.5%
9997   5      3.8%
9950   3      2.3%
4419   1      0.8%
9611   1      0.8%
3221   1      0.8%

Value  Count  Frequency (%)
3221   1      0.8%
4419   1      0.8%
9611   1      0.8%
9950   3      2.3%
9996   57     43.5%
9997   5      3.8%
9999   63     48.1%

Value  Count  Frequency (%)
9999   63     48.1%
9997   5      3.8%
9996   57     43.5%
9950   3      2.3%
9611   1      0.8%
4419   1      0.8%
3221   1      0.8%

tip_ss_
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value  Count
C      59
S      47
N      23
P      2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 131
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: S
3rd row: S
4th row: S
5th row: S

Common Values

Value  Count  Frequency (%)
C      59     45.0%
S      47     35.9%
N      23     17.6%
P      2      1.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
c      59     45.0%
s      47     35.9%
n      23     17.6%
p      2      1.5%

Most occurring characters

Value  Count  Frequency (%)
C      59     45.0%
S      47     35.9%
N      23     17.6%
P      2      1.5%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  131    100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
C      59     45.0%
S      47     35.9%
N      23     17.6%
P      2      1.5%

Most occurring scripts

Value  Count  Frequency (%)
Latin  131    100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
C      59     45.0%
S      47     35.9%
N      23     17.6%
P      2      1.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  131    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
C      59     45.0%
S      47     35.9%
N      23     17.6%
P      2      1.5%

cod_ase_
Categorical

HIGH CORRELATION
MISSING

Distinct: 22
Distinct (%): 20.4%
Missing: 23
Missing (%): 17.6%
Memory size: 1.1 KiB

Value              Count
EPS005             15
EPS002             14
EPSS41             12
ESS133             10
ESS024             7
Other values (17)  50
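
The two percentages reported for cod_ase_ do not share a denominator. The reported figures are consistent with Missing (%) being taken over all rows and Distinct (%) over the non-missing rows only; that reading is an assumption, checked in the sketch below with counts copied from this report (variable names are illustrative).

```python
n_rows = 131      # "Number of observations" in the overview
n_missing = 23    # cod_ase_ "Missing"
n_distinct = 22   # cod_ase_ "Distinct"

# Rows that actually carry a cod_ase_ value.
n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows        # relative to all rows
distinct_pct = 100 * n_distinct / n_present   # relative to non-missing rows

print(f"missing: {missing_pct:.1f}%  distinct: {distinct_pct:.1f}%")
```

The same split explains why the later cod_ase_ frequency tables differ: 15 occurrences of EPS005 are 11.5% of 131 rows but 13.9% of the 108 non-missing rows.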

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 648
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 8.3%

Sample

1st row: EPSS41
2nd row: EPSS41
3rd row: ESS133
4th row: ESS024
5th row: ESS024

Common Values

Value              Count  Frequency (%)
EPS005             15     11.5%
EPS002             14     10.7%
EPSS41             12     9.2%
ESS133             10     7.6%
ESS024             7      5.3%
ESS062             7      5.3%
EPS017             7      5.3%
EPS037             6      4.6%
EPS010             6      4.6%
EPSS37             6      4.6%
Other values (12)  18     13.7%
(Missing)          23     17.6%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
eps005             15     13.9%
eps002             14     13.0%
epss41             12     11.1%
ess133             10     9.3%
eps017             7      6.5%
ess024             7      6.5%
ess062             7      6.5%
eps037             6      5.6%
epss37             6      5.6%
eps010             6      5.6%
Other values (12)  18     16.7%

Most occurring characters

Value             Count  Frequency (%)
S                 158    24.4%
0                 111    17.1%
E                 108    16.7%
P                 78     12.0%
1                 42     6.5%
3                 39     6.0%
2                 31     4.8%
4                 22     3.4%
7                 20     3.1%
5                 16     2.5%
Other values (5)  23     3.5%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  351    54.2%
Decimal Number    297    45.8%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      111    37.4%
1      42     14.1%
3      39     13.1%
2      31     10.4%
4      22     7.4%
7      20     6.7%
5      16     5.4%
6      15     5.1%
9      1      0.3%

Uppercase Letter
Value  Count  Frequency (%)
S      158    45.0%
E      108    30.8%
P      78     22.2%
C      4      1.1%
R      2      0.6%
M      1      0.3%

Most occurring scripts

Value   Count  Frequency (%)
Latin   351    54.2%
Common  297    45.8%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      111    37.4%
1      42     14.1%
3      39     13.1%
2      31     10.4%
4      22     7.4%
7      20     6.7%
5      16     5.4%
6      15     5.1%
9      1      0.3%

Latin
Value  Count  Frequency (%)
S      158    45.0%
E      108    30.8%
P      78     22.2%
C      4      1.1%
R      2      0.6%
M      1      0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  648    100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
S                 158    24.4%
0                 111    17.1%
E                 108    16.7%
P                 78     12.0%
1                 42     6.5%
3                 39     6.0%
2                 31     4.8%
4                 22     3.4%
7                 20     3.1%
5                 16     2.5%
Other values (5)  23     3.5%

estrato_
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 6.2%
Missing: 67
Missing (%): 51.1%
Memory size: 1.1 KiB

Value  Count
2.0    28
3.0    22
1.0    13
4.0    1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 192
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 1.6%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 2.0
4th row: 1.0
5th row: 2.0

Common Values

ValueCountFrequency (%)
2.028
21.4%
3.022
 
16.8%
1.013
 
9.9%
4.01
 
0.8%
(Missing)67
51.1%

Length

2021-07-10T22:47:01.931791image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2.028
43.8%
3.022
34.4%
1.013
20.3%
4.01
 
1.6%
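
The Common Values table and the pie chart table for estrato_ list the same counts with different percentages. The figures are consistent with the first table dividing by all 131 observations and the second dividing by the 64 non-missing ones. A minimal sketch that reproduces both sets of percentages from the counts above; the function name `percentages` is illustrative and is not part of the profiling tool.

```python
# Counts for estrato_ as reported above; None stands for the missing category.
counts = {"2.0": 28, "3.0": 22, "1.0": 13, "4.0": 1, None: 67}


def percentages(counts, include_missing):
    """Return {value: percent}, using all rows or only the non-missing rows."""
    kept = {k: v for k, v in counts.items() if include_missing or k is not None}
    total = sum(kept.values())
    return {k: 100 * v / total for k, v in kept.items()}


share_of_all_rows = percentages(counts, include_missing=True)
share_of_present = percentages(counts, include_missing=False)

for value in ("2.0", "3.0", "1.0", "4.0"):
    print(
        f"{value}: {share_of_all_rows[value]:.1f}% of all rows, "
        f"{share_of_present[value]:.1f}% of non-missing rows"
    )
```

The same denominator choice explains why Distinct (%) for this variable is 6.2% (4 of 64) rather than 4 of 131.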

Most occurring characters

ValueCountFrequency (%)
.64
33.3%
064
33.3%
228
14.6%
322
 
11.5%
113
 
6.8%
41
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number128
66.7%
Other Punctuation64
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
064
50.0%
228
21.9%
322
 
17.2%
113
 
10.2%
41
 
0.8%
Other Punctuation
ValueCountFrequency (%)
.64
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common192
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.64
33.3%
064
33.3%
228
14.6%
322
 
11.5%
113
 
6.8%
41
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.64
33.3%
064
33.3%
228
14.6%
322
 
11.5%
113
 
6.8%
41
 
0.5%

gp_migrant
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
2
108 
1
23 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row2
3rd row2
4th row2
5th row2

Common Values

Value  Count  Frequency (%)
2      108    82.4%
1      23     17.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2108
82.4%
123
 
17.6%

Most occurring characters

ValueCountFrequency (%)
2108
82.4%
123
 
17.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2108
82.4%
123
 
17.6%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2108
82.4%
123
 
17.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2108
82.4%
123
 
17.6%

gp_gestan
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
2
107 
1
24 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row2
3rd row2
4th row2
5th row2

Common Values

Value  Count  Frequency (%)
2      107    81.7%
1      24     18.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
2107
81.7%
124
 
18.3%

Most occurring characters

ValueCountFrequency (%)
2107
81.7%
124
 
18.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2107
81.7%
124
 
18.3%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2107
81.7%
124
 
18.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2107
81.7%
124
 
18.3%

sem_ges_
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct4
Distinct (%)16.7%
Missing107
Missing (%)81.7%
Memory size1.1 KiB
37.0
17 
38.0
39.0
32.0
 
1

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters96
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)4.2%

Sample

1st row38.0
2nd row37.0
3rd row38.0
4th row37.0
5th row39.0

Common Values

Value      Count  Frequency (%)
37.0       17     13.0%
38.0       3      2.3%
39.0       3      2.3%
32.0       1      0.8%
(Missing)  107    81.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
37.017
70.8%
38.03
 
12.5%
39.03
 
12.5%
32.01
 
4.2%

Most occurring characters

ValueCountFrequency (%)
324
25.0%
.24
25.0%
024
25.0%
717
17.7%
83
 
3.1%
93
 
3.1%
21
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number72
75.0%
Other Punctuation24
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
324
33.3%
024
33.3%
717
23.6%
83
 
4.2%
93
 
4.2%
21
 
1.4%
Other Punctuation
ValueCountFrequency (%)
.24
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common96
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
324
25.0%
.24
25.0%
024
25.0%
717
17.7%
83
 
3.1%
93
 
3.1%
21
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII96
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
324
25.0%
.24
25.0%
024
25.0%
717
17.7%
83
 
3.1%
93
 
3.1%
21
 
1.0%

gp_otros
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
1
112 
2
19 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

Value  Count  Frequency (%)
1      112    85.5%
2      19     14.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1112
85.5%
219
 
14.5%

Most occurring characters

ValueCountFrequency (%)
1112
85.5%
219
 
14.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1112
85.5%
219
 
14.5%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1112
85.5%
219
 
14.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1112
85.5%
219
 
14.5%

cod_dpto_r
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)3.1%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
68
126 
20
 
3
54
 
1
13
 
1

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters262
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique2 ?
Unique (%)1.5%

Sample

1st row68
2nd row68
3rd row68
4th row68
5th row68

Common Values

Value  Count  Frequency (%)
68     126    96.2%
20     3      2.3%
54     1      0.8%
13     1      0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
68126
96.2%
203
 
2.3%
541
 
0.8%
131
 
0.8%

Most occurring characters

ValueCountFrequency (%)
6126
48.1%
8126
48.1%
23
 
1.1%
03
 
1.1%
11
 
0.4%
31
 
0.4%
51
 
0.4%
41
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number262
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6126
48.1%
8126
48.1%
23
 
1.1%
03
 
1.1%
11
 
0.4%
31
 
0.4%
51
 
0.4%
41
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common262
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6126
48.1%
8126
48.1%
23
 
1.1%
03
 
1.1%
11
 
0.4%
31
 
0.4%
51
 
0.4%
41
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6126
48.1%
8126
48.1%
23
 
1.1%
03
 
1.1%
11
 
0.4%
31
 
0.4%
51
 
0.4%
41
 
0.4%

cod_mun_r
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct20
Distinct (%)15.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean200.9847328
Minimum1
Maximum855
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 307
95-th percentile: 635
Maximum: 855
Range: 854
Interquartile range (IQR): 306

Descriptive statistics

Standard deviation: 246.0838501
Coefficient of variation (CV): 1.224390762
Kurtosis: -0.5551265459
Mean: 200.9847328
Median Absolute Deviation (MAD): 0
Skewness: 0.8519519219
Sum: 26329
Variance: 60557.2613
Monotonicity: Not monotonic
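
Several of the figures reported for cod_mun_r can be derived from one another: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. A small sketch that recomputes them from the reported values, as a consistency check only; it does not reproduce how the profiling tool computes them internally.

```python
import math

# Figures reported for cod_mun_r above.
n = 131
total = 26329
mean = 200.9847328
std = 246.0838501
variance = 60557.2613
cv = 1.224390762
minimum, q1, q3, maximum = 1, 1, 307, 855

# Quantities recomputed from the other reported figures.
derived = {
    "mean": total / n,
    "variance": std ** 2,
    "cv": std / mean,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}

for name, value in derived.items():
    print(f"{name}: {value:.6g}")

# The recomputed values agree with the report up to rounding.
print(math.isclose(derived["cv"], cv, rel_tol=1e-6))
```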
Histogram with fixed size bins (bins=20)
Most frequent values

Value              Count  Frequency (%)
1                  66     50.4%
276                20     15.3%
547                15     11.5%
307                7      5.3%
11                 3      2.3%
615                3      2.3%
820                2      1.5%
572                2      1.5%
689                2      1.5%
655                1      0.8%
Other values (10)  10     7.6%

Lowest 10 values

Value  Count  Frequency (%)
1      66     50.4%
11     3      2.3%
20     1      0.8%
79     1      0.8%
128    1      0.8%
276    20     15.3%
298    1      0.8%
307    7      5.3%
318    1      0.8%
418    1      0.8%

Highest 10 values

Value  Count  Frequency (%)
855    1      0.8%
820    2      1.5%
689    2      1.5%
670    1      0.8%
655    1      0.8%
615    3      2.3%
572    2      1.5%
547    15     11.5%
464    1      0.8%
444    1      0.8%

fec_con_
Date

HIGH CORRELATION

Distinct89
Distinct (%)67.9%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
Minimum2021-01-06 00:00:00
Maximum2021-06-06 00:00:00
Histogram with fixed size bins (bins=50)

pac_hos_
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
1
121 
2
 
10

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row2
3rd row1
4th row1
5th row1

Common Values

Value  Count  Frequency (%)
1      121    92.4%
2      10     7.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1121
92.4%
210
 
7.6%

Most occurring characters

ValueCountFrequency (%)
1121
92.4%
210
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1121
92.4%
210
 
7.6%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1121
92.4%
210
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1121
92.4%
210
 
7.6%

fec_hos_
Unsupported

REJECTED
UNSUPPORTED

Missing0
Missing (%)0.0%
Memory size1.1 KiB
Distinct127
Distinct (%)96.9%
Minimum1973-12-12 00:00:00
Maximum2006-04-06 00:00:00
Histogram with fixed size bins (bins=50)

fec_aju_
Date

HIGH CORRELATION

Distinct67
Distinct (%)51.1%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
Minimum2021-01-09 00:00:00
Maximum2021-06-21 00:00:00
Histogram with fixed size bins (bins=50)

nit_upgd
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct4
Distinct (%)3.3%
Missing9
Missing (%)6.9%
Memory size1.1 KiB
9000060374.0
43 
8902087588.0
40 
8902096989.0
27 
8000842062.0
12 

Length

Max length12
Median length12
Mean length12
Min length12

Characters and Unicode

Total characters1464
Distinct characters10
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row9000060374.0
2nd row8000842062.0
3rd row8000842062.0
4th row9000060374.0
5th row9000060374.0

Common Values

Value         Count  Frequency (%)
9000060374.0  43     32.8%
8902087588.0  40     30.5%
8902096989.0  27     20.6%
8000842062.0  12     9.2%
(Missing)     9      6.9%
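
Every nit_upgd value carries a trailing ".0", which is the pattern one would expect if an integer identifier column had been read as floating point and then treated as text. That reading is an assumption; the report itself does not say how the data was loaded. Under that assumption, a minimal sketch of a helper (the name `normalize_identifier` is hypothetical) that restores the plain identifier and leaves missing entries untouched:

```python
def normalize_identifier(raw):
    """Turn '9000060374.0' into '9000060374'; pass missing values through as None."""
    if raw is None or raw == "":
        return None
    number = float(raw)
    # Refuse anything that is not a whole number rather than silently truncating it.
    if not number.is_integer():
        raise ValueError(f"not an integer-valued identifier: {raw!r}")
    return str(int(number))


# Two values from the table above plus a missing entry.
samples = ["9000060374.0", "8000842062.0", None]
print([normalize_identifier(s) for s in samples])
```

Ten-digit identifiers such as these are well within the range that a double-precision float represents exactly, so the round trip through `float` does not lose digits here.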

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
9000060374.043
35.2%
8902087588.040
32.8%
8902096989.027
22.1%
8000842062.012
 
9.8%

Most occurring characters

ValueCountFrequency (%)
0519
35.5%
8238
16.3%
9191
 
13.0%
.122
 
8.3%
291
 
6.2%
783
 
5.7%
682
 
5.6%
455
 
3.8%
343
 
2.9%
540
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1342
91.7%
Other Punctuation122
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0519
38.7%
8238
17.7%
9191
 
14.2%
291
 
6.8%
783
 
6.2%
682
 
6.1%
455
 
4.1%
343
 
3.2%
540
 
3.0%
Other Punctuation
ValueCountFrequency (%)
.122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1464
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0519
35.5%
8238
16.3%
9191
 
13.0%
.122
 
8.3%
291
 
6.2%
783
 
5.7%
682
 
5.6%
455
 
3.8%
343
 
2.9%
540
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII1464
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0519
35.5%
8238
16.3%
9191
 
13.0%
.122
 
8.3%
291
 
6.2%
783
 
5.7%
682
 
5.6%
455
 
3.8%
343
 
2.9%
540
 
2.7%

version
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
SIVIGILA - 2018 -18.2.0$0$1.0
90 
SIVIGILA - 2018 - 18.2.0
24 
SIVIGILA - 2018 -18.1.10$0$0.2
12 
SIVIGILA - 2018 - 18.1.10
 
3
SIVIGILA - 2018 -18.3.0$0$0.1
 
2

Length

Max length30
Median length29
Mean length28.08396947
Min length24

Characters and Unicode

Total characters3679
Distinct characters15
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSIVIGILA - 2018 -18.2.0$0$1.0
2nd rowSIVIGILA - 2018 - 18.2.0
3rd rowSIVIGILA - 2018 - 18.2.0
4th rowSIVIGILA - 2018 -18.1.10$0$0.2
5th rowSIVIGILA - 2018 -18.1.10$0$0.2

Common Values

Value                           Count  Frequency (%)
SIVIGILA - 2018 -18.2.0$0$1.0   90     68.7%
SIVIGILA - 2018 - 18.2.0        24     18.3%
SIVIGILA - 2018 -18.1.10$0$0.2  12     9.2%
SIVIGILA - 2018 - 18.1.10       3      2.3%
SIVIGILA - 2018 -18.3.0$0$0.1   2      1.5%
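
The five version strings differ in spacing and in an optional "$..." suffix, so the same release appears under more than one label. A rough sketch of grouping them by release number follows. The regular expression is an assumption inferred from the five values listed above, not a documented SIVIGILA format, and `release_of` is a hypothetical helper name.

```python
import re
from collections import Counter

# Value -> count, copied from the Common Values table for `version`.
version_counts = {
    "SIVIGILA - 2018 -18.2.0$0$1.0": 90,
    "SIVIGILA - 2018 - 18.2.0": 24,
    "SIVIGILA - 2018 -18.1.10$0$0.2": 12,
    "SIVIGILA - 2018 - 18.1.10": 3,
    "SIVIGILA - 2018 -18.3.0$0$0.1": 2,
}

# Assumed layout: product name, year, release number, optional "$..." suffix.
VERSION_RE = re.compile(r"^SIVIGILA - (\d{4}) -\s*(\d+\.\d+\.\d+)(?:\$(.*))?$")


def release_of(value):
    """Extract the release number (for example '18.2.0') from a version string."""
    match = VERSION_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised version string: {value!r}")
    return match.group(2)


by_release = Counter()
for value, count in version_counts.items():
    by_release[release_of(value)] += count

print(by_release.most_common())
```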

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
158
28.7%
2018131
23.8%
sivigila131
23.8%
18.2.0$0$1.090
16.3%
18.2.024
 
4.4%
18.1.10$0$0.212
 
2.2%
18.1.103
 
0.5%
18.3.0$0$0.12
 
0.4%

Most occurring characters

ValueCountFrequency (%)
0470
12.8%
420
11.4%
I393
10.7%
1384
10.4%
.366
9.9%
-262
7.1%
8262
7.1%
2257
7.0%
$208
 
5.7%
S131
 
3.6%
Other values (5)526
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1375
37.4%
Uppercase Letter1048
28.5%
Space Separator420
 
11.4%
Other Punctuation366
 
9.9%
Dash Punctuation262
 
7.1%
Currency Symbol208
 
5.7%
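
The category totals above follow the Unicode general categories of the individual characters. As an illustration, they can be recomputed with Python's standard `unicodedata` module from the five version strings and their counts. This is a sketch of the idea and not the profiling tool's own implementation; `unicodedata.category` returns two-letter codes (`Nd` for Decimal Number, `Lu` for Uppercase Letter, `Zs` for Space Separator, `Po` for Other Punctuation, `Pd` for Dash Punctuation, `Sc` for Currency Symbol).

```python
import unicodedata
from collections import Counter

# Value -> count, copied from the Common Values table for `version`.
version_counts = {
    "SIVIGILA - 2018 -18.2.0$0$1.0": 90,
    "SIVIGILA - 2018 - 18.2.0": 24,
    "SIVIGILA - 2018 -18.1.10$0$0.2": 12,
    "SIVIGILA - 2018 - 18.1.10": 3,
    "SIVIGILA - 2018 -18.3.0$0$0.1": 2,
}

# Weight every character by the number of rows that contain its string.
category_totals = Counter()
for value, count in version_counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += count

total_characters = sum(category_totals.values())
for code, total in category_totals.most_common():
    print(f"{code}: {total} ({100 * total / total_characters:.1f}%)")
```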

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I393
37.5%
S131
 
12.5%
V131
 
12.5%
G131
 
12.5%
L131
 
12.5%
A131
 
12.5%
Decimal Number
ValueCountFrequency (%)
0470
34.2%
1384
27.9%
8262
19.1%
2257
18.7%
32
 
0.1%
Space Separator
ValueCountFrequency (%)
420
100.0%
Dash Punctuation
ValueCountFrequency (%)
-262
100.0%
Other Punctuation
ValueCountFrequency (%)
.366
100.0%
Currency Symbol
ValueCountFrequency (%)
$208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common2631
71.5%
Latin1048
 
28.5%

Most frequent character per script

Common
ValueCountFrequency (%)
0470
17.9%
420
16.0%
1384
14.6%
.366
13.9%
-262
10.0%
8262
10.0%
2257
9.8%
$208
7.9%
32
 
0.1%
Latin
ValueCountFrequency (%)
I393
37.5%
S131
 
12.5%
V131
 
12.5%
G131
 
12.5%
L131
 
12.5%
A131
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII3679
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0470
12.8%
420
11.4%
I393
10.7%
1384
10.4%
.366
9.9%
-262
7.1%
8262
7.1%
2257
7.0%
$208
 
5.7%
S131
 
3.6%
Other values (5)526
14.3%

tip_doc_rn
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct3
Distinct (%)2.3%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
MS
86 
CN
44 
PE
 
1

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters262
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)0.8%

Sample

1st rowMS
2nd rowMS
3rd rowMS
4th rowMS
5th rowMS

Common Values

Value  Count  Frequency (%)
MS     86     65.6%
CN     44     33.6%
PE     1      0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
ms86
65.6%
cn44
33.6%
pe1
 
0.8%

Most occurring characters

ValueCountFrequency (%)
M86
32.8%
S86
32.8%
C44
16.8%
N44
16.8%
P1
 
0.4%
E1
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter262
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M86
32.8%
S86
32.8%
C44
16.8%
N44
16.8%
P1
 
0.4%
E1
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin262
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M86
32.8%
S86
32.8%
C44
16.8%
N44
16.8%
P1
 
0.4%
E1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M86
32.8%
S86
32.8%
C44
16.8%
N44
16.8%
P1
 
0.4%
E1
 
0.4%

no_ide_rn
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct131
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
164452072
 
1
164484846
 
1
163793919
 
1
1005136505
 
1
1005307711-1
 
1
Other values (126)
126 

Length

Max length15
Median length10
Mean length10.48091603
Min length8

Characters and Unicode

Total characters1373
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique131 ?
Unique (%)100.0%

Sample

1st row37372773-4
2nd row1098713904-1
3rd row1099940520-2
4th row1000251004-2
5th row1000251004-1

Common Values

ValueCountFrequency (%)
1644520721
 
0.8%
1644848461
 
0.8%
1637939191
 
0.8%
10051365051
 
0.8%
1005307711-11
 
0.8%
1644676401
 
0.8%
1644840771
 
0.8%
1101200798-11
 
0.8%
1098409345-11
 
0.8%
109868566211
 
0.8%
Other values (121)121
92.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ven277381841
 
0.8%
1095832837-11
 
0.8%
1644873581
 
0.8%
1637939191
 
0.8%
10051365051
 
0.8%
1005307711-11
 
0.8%
1644676401
 
0.8%
1644840771
 
0.8%
1101200798-11
 
0.8%
1098409345-11
 
0.8%
Other values (121)121
92.4%

Most occurring characters

ValueCountFrequency (%)
1259
18.9%
4154
11.2%
6153
11.1%
0138
10.1%
8111
8.1%
3102
 
7.4%
9102
 
7.4%
791
 
6.6%
586
 
6.3%
283
 
6.0%
Other values (5)94
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1279
93.2%
Dash Punctuation60
 
4.4%
Uppercase Letter33
 
2.4%
Space Separator1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1259
20.3%
4154
12.0%
6153
12.0%
0138
10.8%
8111
8.7%
3102
 
8.0%
9102
 
8.0%
791
 
7.1%
586
 
6.7%
283
 
6.5%
Uppercase Letter
ValueCountFrequency (%)
V11
33.3%
E11
33.3%
N11
33.3%
Dash Punctuation
ValueCountFrequency (%)
-60
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1340
97.6%
Latin33
 
2.4%

Most frequent character per script

Common
ValueCountFrequency (%)
1259
19.3%
4154
11.5%
6153
11.4%
0138
10.3%
8111
8.3%
3102
 
7.6%
9102
 
7.6%
791
 
6.8%
586
 
6.4%
283
 
6.2%
Other values (2)61
 
4.6%
Latin
ValueCountFrequency (%)
V11
33.3%
E11
33.3%
N11
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1373
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1259
18.9%
4154
11.2%
6153
11.1%
0138
10.1%
8111
8.1%
3102
 
7.4%
9102
 
7.4%
791
 
6.6%
586
 
6.3%
283
 
6.0%
Other values (5)94
 
6.8%

fecha_nac
Date

HIGH CORRELATION

Distinct90
Distinct (%)68.7%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
Minimum2021-01-06 00:00:00
Maximum2021-06-06 00:00:00
Histogram with fixed size bins (bins=50)

edad_rn
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct21
Distinct (%)16.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.557251908
Minimum0
Maximum45
Zeros5
Zeros (%)3.8%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 4
Q3: 7
95-th percentile: 16
Maximum: 45
Range: 45
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 6.614855432
Coefficient of variation (CV): 1.190310524
Kurtosis: 15.04142209
Mean: 5.557251908
Median Absolute Deviation (MAD): 2
Skewness: 3.411140944
Sum: 728
Variance: 43.75631239
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Most frequent values

Value              Count  Frequency (%)
1                  23     17.6%
2                  18     13.7%
3                  18     13.7%
5                  12     9.2%
4                  11     8.4%
7                  10     7.6%
6                  8      6.1%
0                  5      3.8%
11                 5      3.8%
8                  4      3.1%
Other values (11)  17     13.0%

Lowest 10 values

Value  Count  Frequency (%)
0      5      3.8%
1      23     17.6%
2      18     13.7%
3      18     13.7%
4      11     8.4%
5      12     9.2%
6      8      6.1%
7      10     7.6%
8      4      3.1%
9      2      1.5%

Highest 10 values

Value  Count  Frequency (%)
45     1      0.8%
38     1      0.8%
33     1      0.8%
24     1      0.8%
17     1      0.8%
16     3      2.3%
15     1      0.8%
14     1      0.8%
12     2      1.5%
11     5      3.8%

sexo
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
F
79 
M
52 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowF
2nd rowM
3rd rowF
4th rowF
5th rowF

Common Values

Value  Count  Frequency (%)
F      79     60.3%
M      52     39.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
f79
60.3%
m52
39.7%

Most occurring characters

ValueCountFrequency (%)
F79
60.3%
M52
39.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter131
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F79
60.3%
M52
39.7%

Most occurring scripts

ValueCountFrequency (%)
Latin131
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
F79
60.3%
M52
39.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
F79
60.3%
M52
39.7%

peso_nacer
Real number (ℝ≥0)

HIGH CORRELATION

Distinct71
Distinct (%)54.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2302.603053
Minimum1655
Maximum2490
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum: 1655
5-th percentile: 1997.5
Q1: 2220
median: 2335
Q3: 2440
95-th percentile: 2480
Maximum: 2490
Range: 835
Interquartile range (IQR): 220

Descriptive statistics

Standard deviation: 164.8139128
Coefficient of variation (CV): 0.07157721454
Kurtosis: 2.40181567
Mean: 2302.603053
Median Absolute Deviation (MAD): 110
Skewness: -1.387742201
Sum: 301641
Variance: 27163.62584
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value              Count  Frequency (%)
2375               6      4.6%
2100               6      4.6%
2270               5      3.8%
2470               5      3.8%
2465               4      3.1%
2480               4      3.1%
2475               4      3.1%
2450               3      2.3%
2485               3      2.3%
2255               3      2.3%
Other values (61)  88     67.2%

Lowest 10 values

Value  Count  Frequency (%)
1655   1      0.8%
1755   1      0.8%
1800   1      0.8%
1810   1      0.8%
1910   1      0.8%
1930   1      0.8%
1965   1      0.8%
2030   1      0.8%
2040   1      0.8%
2080   1      0.8%

Highest 10 values

Value  Count  Frequency (%)
2490   3      2.3%
2485   3      2.3%
2480   4      3.1%
2475   4      3.1%
2470   5      3.8%
2465   4      3.1%
2460   3      2.3%
2455   1      0.8%
2450   3      2.3%
2445   1      0.8%

talla_nace
Real number (ℝ≥0)

HIGH CORRELATION

Distinct14
Distinct (%)10.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean47.16030534
Minimum42
Maximum52
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum: 42
5-th percentile: 44
Q1: 46
median: 47
Q3: 48
95-th percentile: 50
Maximum: 52
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.858729069
Coefficient of variation (CV): 0.03941299904
Kurtosis: 0.2035621377
Mean: 47.16030534
Median Absolute Deviation (MAD): 1
Skewness: -0.009774317438
Sum: 6178
Variance: 3.454873752
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Most frequent values

Value             Count  Frequency (%)
48                32     24.4%
47                29     22.1%
45                17     13.0%
46                17     13.0%
49                10     7.6%
50                7      5.3%
44                4      3.1%
43                3      2.3%
51                3      2.3%
49.5              3      2.3%
Other values (4)  6      4.6%

Lowest 10 values

Value  Count  Frequency (%)
42     1      0.8%
43     3      2.3%
44     4      3.1%
45     17     13.0%
45.5   2      1.5%
46     17     13.0%
47     29     22.1%
48     32     24.4%
48.5   1      0.8%
49     10     7.6%

Highest 10 values

Value  Count  Frequency (%)
52     2      1.5%
51     3      2.3%
50     7      5.3%
49.5   3      2.3%
49     10     7.6%
48.5   1      0.8%
48     32     24.4%
47     29     22.1%
46     17     13.0%
45.5   2      1.5%

sem_gest
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct4
Distinct (%)3.1%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
37
98 
38
20 
39
11 
40
 
2

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters262
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row37
2nd row37
3rd row39
4th row37
5th row37

Common Values

Value  Count  Frequency (%)
37     98     74.8%
38     20     15.3%
39     11     8.4%
40     2      1.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
3798
74.8%
3820
 
15.3%
3911
 
8.4%
402
 
1.5%

Most occurring characters

ValueCountFrequency (%)
3129
49.2%
798
37.4%
820
 
7.6%
911
 
4.2%
42
 
0.8%
02
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number262
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3129
49.2%
798
37.4%
820
 
7.6%
911
 
4.2%
42
 
0.8%
02
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common262
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3129
49.2%
798
37.4%
820
 
7.6%
911
 
4.2%
42
 
0.8%
02
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3129
49.2%
798
37.4%
820
 
7.6%
911
 
4.2%
42
 
0.8%
02
 
0.8%

mult_embar
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)1.5%
Missing0
Missing (%)0.0%
Memory size1.1 KiB
1
124 
2
 
7

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters131
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row2

Common Values

Value  Count  Frequency (%)
1      124    94.7%
2      7      5.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
1124
94.7%
27
 
5.3%

Most occurring characters

ValueCountFrequency (%)
1124
94.7%
27
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1124
94.7%
27
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1124
94.7%
27
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1124
94.7%
27
 
5.3%

num_em_pre
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct6
Distinct (%)4.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.007633588
Minimum0
Maximum5
Zeros62
Zeros (%)47.3%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.205731079
Coefficient of variation (CV): 1.196596752
Kurtosis: 0.274921461
Mean: 1.007633588
Median Absolute Deviation (MAD): 1
Skewness: 1.054711092
Sum: 132
Variance: 1.453787434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values

Value  Count  Frequency (%)
0      62     47.3%
1      31     23.7%
2      19     14.5%
3      14     10.7%
4      4      3.1%
5      1      0.8%

Lowest values

Value  Count  Frequency (%)
0      62     47.3%
1      31     23.7%
2      19     14.5%
3      14     10.7%
4      4      3.1%
5      1      0.8%

Highest values

Value  Count  Frequency (%)
5      1      0.8%
4      4      3.1%
3      14     10.7%
2      19     14.5%
1      31     23.7%
0      62     47.3%

num_hi_viv
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
1 | 73
2 | 38
3 | 15
4 | 4
5 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 131
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 4
2nd row: 1
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 73 | 55.7%
2 | 38 | 29.0%
3 | 15 | 11.5%
4 | 4 | 3.1%
5 | 1 | 0.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 73 | 55.7%
2 | 38 | 29.0%
3 | 15 | 11.5%
4 | 4 | 3.1%
5 | 1 | 0.8%

Most occurring characters

ValueCountFrequency (%)
173
55.7%
238
29.0%
315
 
11.5%
44
 
3.1%
51
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number131
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
173
55.7%
238
29.0%
315
 
11.5%
44
 
3.1%
51
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common131
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
173
55.7%
238
29.0%
315
 
11.5%
44
 
3.1%
51
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
173
55.7%
238
29.0%
315
 
11.5%
44
 
3.1%
51
 
0.8%

niv_edu_ma
Categorical

HIGH CORRELATION
MISSING

Distinct: 4
Distinct (%): 3.1%
Missing: 2
Missing (%): 1.5%
Memory size: 1.1 KiB

Value | Count
2.0 | 64
3.0 | 43
1.0 | 21
4.0 | 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 387
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Common Values

Value | Count | Frequency (%)
2.0 | 64 | 48.9%
3.0 | 43 | 32.8%
1.0 | 21 | 16.0%
4.0 | 1 | 0.8%
(Missing) | 2 | 1.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
2.0 | 64 | 49.6%
3.0 | 43 | 33.3%
1.0 | 21 | 16.3%
4.0 | 1 | 0.8%
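
The Common Values table and the pie chart table for niv_edu_ma report slightly different percentages for the same counts (for example 48.9% versus 49.6% for value 2.0). The figures are consistent with the first table dividing by all 131 observations and the second dividing only by the 129 non-missing ones. The sketch below checks that reading. It is an interpretation of the numbers, not a statement about the profiling tool's internals, and the helper name frequency_pct is hypothetical.

```python
def frequency_pct(count, total):
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts for niv_edu_ma as listed in this report.
counts = {"2.0": 64, "3.0": 43, "1.0": 21, "4.0": 1}
n_missing = 2

n_present = sum(counts.values())  # 129 non-missing observations
n_rows = n_present + n_missing    # 131 observations in total

# Denominator includes missing values (matches the Common Values table).
with_missing = {value: frequency_pct(c, n_rows) for value, c in counts.items()}
# Denominator excludes missing values (matches the pie chart table).
without_missing = {value: frequency_pct(c, n_present) for value, c in counts.items()}

print(with_missing)
print(without_missing)
```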

Most occurring characters

ValueCountFrequency (%)
.129
33.3%
0129
33.3%
264
16.5%
343
 
11.1%
121
 
5.4%
41
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number258
66.7%
Other Punctuation129
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0129
50.0%
264
24.8%
343
 
16.7%
121
 
8.1%
41
 
0.4%
Other Punctuation
ValueCountFrequency (%)
.129
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common387
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.129
33.3%
0129
33.3%
264
16.5%
343
 
11.1%
121
 
5.4%
41
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII387
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.129
33.3%
0129
33.3%
264
16.5%
343
 
11.1%
121
 
5.4%
41
 
0.3%

nom_upgd
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): 4.1%
Missing: 9
Missing (%): 6.9%
Memory size: 1.1 KiB

Value | Count
HOSPITAL UNIVERSITARIO DE SANTANDER | 43
CLINICA MATERNO INFANTIL SAN LUIS SA | 40
CLINICA CHICAMOCHA SA | 27
HOSPITAL LOCAL DEL NORTE | 10
UIMIST | 2

Length

Max length: 36
Median length: 35
Mean length: 30.85245902
Min length: 6

Characters and Unicode

Total characters: 3764
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
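
As an illustration of how such character properties can be tallied, the sketch below uses Python's standard unicodedata module to count the general category of every character in nom_upgd, weighting each distinct value by its reported count. The function name category_totals is hypothetical. "Lu" is the Unicode code for Uppercase Letter and "Zs" for Space Separator.

```python
import unicodedata
from collections import Counter

# Value counts for nom_upgd as listed in this report (missing values excluded).
nom_upgd_counts = {
    "HOSPITAL UNIVERSITARIO DE SANTANDER": 43,
    "CLINICA MATERNO INFANTIL SAN LUIS SA": 40,
    "CLINICA CHICAMOCHA SA": 27,
    "HOSPITAL LOCAL DEL NORTE": 10,
    "UIMIST": 2,
}


def category_totals(value_counts):
    """Count characters per Unicode general category across all rows."""
    totals = Counter()
    for text, count in value_counts.items():
        for ch in text:
            # unicodedata.category returns a two-letter code such as "Lu" or "Zs".
            totals[unicodedata.category(ch)] += count
    return totals


totals = category_totals(nom_upgd_counts)
print(dict(totals), sum(totals.values()))
```

For these counts the tally gives 3351 uppercase letters and 413 spaces, 3764 characters in total, which agrees with the figures reported for this variable.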

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: HOSPITAL UNIVERSITARIO DE SANTANDER
2nd row: HOSPITAL LOCAL DEL NORTE
3rd row: HOSPITAL LOCAL DEL NORTE
4th row: HOSPITAL UNIVERSITARIO DE SANTANDER
5th row: HOSPITAL UNIVERSITARIO DE SANTANDER

Common Values

Value | Count | Frequency (%)
HOSPITAL UNIVERSITARIO DE SANTANDER | 43 | 32.8%
CLINICA MATERNO INFANTIL SAN LUIS SA | 40 | 30.5%
CLINICA CHICAMOCHA SA | 27 | 20.6%
HOSPITAL LOCAL DEL NORTE | 10 | 7.6%
UIMIST | 2 | 1.5%
(Missing) | 9 | 6.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
sa | 67 | 12.5%
clinica | 67 | 12.5%
hospital | 53 | 9.9%
santander | 43 | 8.0%
universitario | 43 | 8.0%
de | 43 | 8.0%
san | 40 | 7.5%
materno | 40 | 7.5%
infantil | 40 | 7.5%
luis | 40 | 7.5%
Other values (5) | 59 | 11.0%

Most occurring characters

ValueCountFrequency (%)
A500
13.3%
I467
12.4%
413
11.0%
N366
9.7%
S288
7.7%
T231
 
6.1%
L230
 
6.1%
C225
 
6.0%
E189
 
5.0%
O183
 
4.9%
Other values (8)672
17.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3351
89.0%
Space Separator413
 
11.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A500
14.9%
I467
13.9%
N366
10.9%
S288
8.6%
T231
6.9%
L230
6.9%
C225
6.7%
E189
 
5.6%
O183
 
5.5%
R179
 
5.3%
Other values (7)493
14.7%
Space Separator
ValueCountFrequency (%)
413
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3351
89.0%
Common413
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A500
14.9%
I467
13.9%
N366
10.9%
S288
8.6%
T231
6.9%
L230
6.9%
C225
6.7%
E189
 
5.6%
O183
 
5.5%
R179
 
5.3%
Other values (7)493
14.7%
Common
ValueCountFrequency (%)
413
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3764
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A500
13.3%
I467
12.4%
413
11.0%
N366
9.7%
S288
7.7%
T231
 
6.1%
L230
 
6.1%
C225
 
6.0%
E189
 
5.0%
O183
 
4.9%
Other values (8)672
17.9%

ndep_proce
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
SANTANDER | 125
CESAR | 3
BOLIVAR | 2
NORTE SANTANDER | 1

Length

Max length: 15
Median length: 9
Mean length: 8.923664122
Min length: 5

Characters and Unicode

Total characters: 1169
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: SANTANDER
2nd row: SANTANDER
3rd row: SANTANDER
4th row: SANTANDER
5th row: SANTANDER

Common Values

Value | Count | Frequency (%)
SANTANDER | 125 | 95.4%
CESAR | 3 | 2.3%
BOLIVAR | 2 | 1.5%
NORTE SANTANDER | 1 | 0.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
santander | 126 | 95.5%
cesar | 3 | 2.3%
bolivar | 2 | 1.5%
norte | 1 | 0.8%

Most occurring characters

ValueCountFrequency (%)
A257
22.0%
N253
21.6%
R132
11.3%
E130
11.1%
S129
11.0%
T127
10.9%
D126
10.8%
O3
 
0.3%
C3
 
0.3%
B2
 
0.2%
Other values (4)7
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1168
99.9%
Space Separator1
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A257
22.0%
N253
21.7%
R132
11.3%
E130
11.1%
S129
11.0%
T127
10.9%
D126
10.8%
O3
 
0.3%
C3
 
0.3%
B2
 
0.2%
Other values (3)6
 
0.5%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1168
99.9%
Common1
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A257
22.0%
N253
21.7%
R132
11.3%
E130
11.1%
S129
11.0%
T127
10.9%
D126
10.8%
O3
 
0.3%
C3
 
0.3%
B2
 
0.2%
Other values (3)6
 
0.5%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1169
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A257
22.0%
N253
21.6%
R132
11.3%
E130
11.1%
S129
11.0%
T127
10.9%
D126
10.8%
O3
 
0.3%
C3
 
0.3%
B2
 
0.2%
Other values (4)7
 
0.6%

nmun_proce
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 21
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
BUCARAMANGA | 60
FLORIDABLANCA | 22
PIEDECUESTA | 15
GIRON | 7
TONA | 4
Other values (16) | 23

Length

Max length: 22
Median length: 11
Mean length: 10.73282443
Min length: 4

Characters and Unicode

Total characters: 1406
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 8.4%

Sample

1st row: SAN VICENTE DE CHUCURI
2nd row: CIMITARRA
3rd row: TONA
4th row: PUENTE NACIONAL
5th row: PUENTE NACIONAL

Common Values

Value | Count | Frequency (%)
BUCARAMANGA | 60 | 45.8%
FLORIDABLANCA | 22 | 16.8%
PIEDECUESTA | 15 | 11.5%
GIRON | 7 | 5.3%
TONA | 4 | 3.1%
RIONEGRO | 3 | 2.3%
AGUACHICA | 3 | 2.3%
PUENTE NACIONAL | 2 | 1.5%
SAN PABLO | 2 | 1.5%
SAN VICENTE DE CHUCURI | 2 | 1.5%
Other values (11) | 11 | 8.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
bucaramanga | 60 | 40.8%
floridablanca | 22 | 15.0%
piedecuesta | 15 | 10.2%
giron | 7 | 4.8%
san | 5 | 3.4%
tona | 4 | 2.7%
de | 4 | 2.7%
rionegro | 3 | 2.0%
aguachica | 3 | 2.0%
chucuri | 2 | 1.4%
Other values (18) | 22 | 15.0%

Most occurring characters

ValueCountFrequency (%)
A367
26.1%
C116
 
8.3%
N113
 
8.0%
R104
 
7.4%
B88
 
6.3%
U85
 
6.0%
G76
 
5.4%
E64
 
4.6%
M64
 
4.6%
I62
 
4.4%
Other values (12)267
19.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1390
98.9%
Space Separator16
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A367
26.4%
C116
 
8.3%
N113
 
8.1%
R104
 
7.5%
B88
 
6.3%
U85
 
6.1%
G76
 
5.5%
E64
 
4.6%
M64
 
4.6%
I62
 
4.5%
Other values (11)251
18.1%
Space Separator
ValueCountFrequency (%)
16
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1390
98.9%
Common16
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A367
26.4%
C116
 
8.3%
N113
 
8.1%
R104
 
7.5%
B88
 
6.3%
U85
 
6.1%
G76
 
5.5%
E64
 
4.6%
M64
 
4.6%
I62
 
4.5%
Other values (11)251
18.1%
Common
ValueCountFrequency (%)
16
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1406
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A367
26.1%
C116
 
8.3%
N113
 
8.0%
R104
 
7.4%
B88
 
6.3%
U85
 
6.0%
G76
 
5.4%
E64
 
4.6%
M64
 
4.6%
I62
 
4.4%
Other values (12)267
19.0%

ndep_resi
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
SANTANDER | 126
CESAR | 3
BOLIVAR | 1
NORTE SANTANDER | 1

Length

Max length: 15
Median length: 9
Mean length: 8.938931298
Min length: 5

Characters and Unicode

Total characters: 1171
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 1.5%

Sample

1st row: SANTANDER
2nd row: SANTANDER
3rd row: SANTANDER
4th row: SANTANDER
5th row: SANTANDER

Common Values

Value | Count | Frequency (%)
SANTANDER | 126 | 96.2%
CESAR | 3 | 2.3%
BOLIVAR | 1 | 0.8%
NORTE SANTANDER | 1 | 0.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
santander | 127 | 96.2%
cesar | 3 | 2.3%
norte | 1 | 0.8%
bolivar | 1 | 0.8%

Most occurring characters

ValueCountFrequency (%)
A258
22.0%
N255
21.8%
R132
11.3%
E131
11.2%
S130
11.1%
T128
10.9%
D127
10.8%
C3
 
0.3%
O2
 
0.2%
B1
 
0.1%
Other values (4)4
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1170
99.9%
Space Separator1
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A258
22.1%
N255
21.8%
R132
11.3%
E131
11.2%
S130
11.1%
T128
10.9%
D127
10.9%
C3
 
0.3%
O2
 
0.2%
B1
 
0.1%
Other values (3)3
 
0.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1170
99.9%
Common1
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A258
22.1%
N255
21.8%
R132
11.3%
E131
11.2%
S130
11.1%
T128
10.9%
D127
10.9%
C3
 
0.3%
O2
 
0.2%
B1
 
0.1%
Other values (3)3
 
0.3%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1171
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A258
22.0%
N255
21.8%
R132
11.3%
E131
11.2%
S130
11.1%
T128
10.9%
D127
10.8%
C3
 
0.3%
O2
 
0.2%
B1
 
0.1%
Other values (4)4
 
0.3%

nmun_resi
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 20
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
BUCARAMANGA | 66
FLORIDABLANCA | 20
PIEDECUESTA | 15
GIRON | 7
RIONEGRO | 3
Other values (15) | 20

Length

Max length: 22
Median length: 11
Mean length: 10.83969466
Min length: 4

Characters and Unicode

Total characters: 1420
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 8.4%

Sample

1st row: SAN VICENTE DE CHUCURI
2nd row: BUCARAMANGA
3rd row: BUCARAMANGA
4th row: PUENTE NACIONAL
5th row: PUENTE NACIONAL

Common Values

Value | Count | Frequency (%)
BUCARAMANGA | 66 | 50.4%
FLORIDABLANCA | 20 | 15.3%
PIEDECUESTA | 15 | 11.5%
GIRON | 7 | 5.3%
RIONEGRO | 3 | 2.3%
AGUACHICA | 3 | 2.3%
SAN VICENTE DE CHUCURI | 2 | 1.5%
PUENTE NACIONAL | 2 | 1.5%
TONA | 2 | 1.5%
VALLE DE SAN JOSE | 1 | 0.8%
Other values (10) | 10 | 7.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
bucaramanga | 66 | 45.2%
floridablanca | 20 | 13.7%
piedecuesta | 15 | 10.3%
giron | 7 | 4.8%
san | 4 | 2.7%
de | 4 | 2.7%
rionegro | 3 | 2.1%
aguachica | 3 | 2.1%
puente | 2 | 1.4%
chucuri | 2 | 1.4%
Other values (17) | 20 | 13.7%

Most occurring characters

ValueCountFrequency (%)
A379
26.7%
C119
 
8.4%
N114
 
8.0%
R106
 
7.5%
U91
 
6.4%
B91
 
6.4%
G82
 
5.8%
M69
 
4.9%
E64
 
4.5%
I58
 
4.1%
Other values (12)247
17.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1405
98.9%
Space Separator15
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A379
27.0%
C119
 
8.5%
N114
 
8.1%
R106
 
7.5%
U91
 
6.5%
B91
 
6.5%
G82
 
5.8%
M69
 
4.9%
E64
 
4.6%
I58
 
4.1%
Other values (11)232
16.5%
Space Separator
ValueCountFrequency (%)
15
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1405
98.9%
Common15
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A379
27.0%
C119
 
8.5%
N114
 
8.1%
R106
 
7.5%
U91
 
6.5%
B91
 
6.5%
G82
 
5.8%
M69
 
4.9%
E64
 
4.6%
I58
 
4.1%
Other values (11)232
16.5%
Common
ValueCountFrequency (%)
15
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1420
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A379
26.7%
C119
 
8.4%
N114
 
8.0%
R106
 
7.5%
U91
 
6.4%
B91
 
6.4%
G82
 
5.8%
M69
 
4.9%
E64
 
4.5%
I58
 
4.1%
Other values (12)247
17.4%

nreg
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 131
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.12977099
Minimum: 1
Maximum: 134
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7.5
Q1: 33.5
median: 66
Q3: 98.5
95-th percentile: 125.5
Maximum: 134
Range: 133
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 38.17034631
Coefficient of variation (CV): 0.577203667
Kurtosis: -1.18117262
Mean: 66.12977099
Median Absolute Deviation (MAD): 33
Skewness: 0.01656812293
Sum: 8663
Variance: 1456.975338
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 1 | 0.8%
99 | 1 | 0.8%
97 | 1 | 0.8%
96 | 1 | 0.8%
95 | 1 | 0.8%
94 | 1 | 0.8%
93 | 1 | 0.8%
92 | 1 | 0.8%
91 | 1 | 0.8%
90 | 1 | 0.8%
Other values (121) | 121 | 92.4%

Value | Count | Frequency (%)
1 | 1 | 0.8%
2 | 1 | 0.8%
3 | 1 | 0.8%
4 | 1 | 0.8%
5 | 1 | 0.8%
6 | 1 | 0.8%
7 | 1 | 0.8%
8 | 1 | 0.8%
9 | 1 | 0.8%
10 | 1 | 0.8%

Value | Count | Frequency (%)
134 | 1 | 0.8%
133 | 1 | 0.8%
130 | 1 | 0.8%
129 | 1 | 0.8%
128 | 1 | 0.8%
127 | 1 | 0.8%
126 | 1 | 0.8%
125 | 1 | 0.8%
124 | 1 | 0.8%
123 | 1 | 0.8%

Interactions

[Interaction plots; images not reproduced.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the slope of the line affects only the sign of r, not its magnitude.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
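
A minimal sketch of that calculation in plain Python follows. The function name pearson_r and the sample lists are illustrative assumptions, not taken from the report or the dataset. The 1/(n - 1) factors in the covariance and in the two standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Unnormalised covariance and deviations; the normalising factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)                                  # approximately 0.8
r_rescaled = pearson_r([3 * x + 5 for x in xs], ys)    # unchanged by shift and positive scale
print(r, r_rescaled)
```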

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
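
The same recipe can be sketched in plain Python: rank both variables, then apply the Pearson formula to the ranks. The helper names and the sample data are illustrative assumptions. Ties receive the average of the positions they occupy, which is a common convention but an assumption about what the report's software does.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data: a monotonic but nonlinear relationship.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)   # close to 1: perfectly monotonic
r = pearson_r(xs, ys)        # below 1: not perfectly linear
print(rho, r)
```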

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
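
The pair-counting definition above can be sketched directly. The function name kendall_tau_a and the sample data are illustrative assumptions. This is the simple variant usually called tau-a; statistical libraries commonly use a tie-adjusted variant (tau-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


# Illustrative data only: 8 concordant and 2 discordant pairs out of 10.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
tau = kendall_tau_a(xs, ys)
print(tau)
```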

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
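
A minimal sketch of a bias-corrected Cramér's V for a contingency table of counts is given below. It follows the correction as commonly stated for Bergsma's proposal: φ² is reduced by (k-1)(r-1)/(n-1), and the row and column counts r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the example tables are illustrative assumptions. No continuity correction is applied here, so this may differ in detail from what the report's software computes.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Illustrative tables only.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])  # no association
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfect association
print(v_independent, v_perfect)
```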

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
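
To make the idea of nullity correlation concrete, the sketch below converts two columns into presence indicators (1 if a value is present, 0 if it is missing) and correlates those indicators. The function name nullity_correlation and the toy columns are made up for illustration and do not come from the dataset. None stands in for a missing value.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy columns: `first` and `second` are missing together, `third` is missing
# exactly where `first` is present, and `full` has no missing values.
first = [1, None, 3, None]
second = [5, None, 7, None]
third = [None, 2, None, 4]
full = [1, 2, 3, 4]

together = nullity_correlation(first, second)   # 1.0
opposite = nullity_correlation(first, third)    # -1.0
undefined = nullity_correlation(first, full)    # None
print(together, opposite, undefined)
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly where the other is missing.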

Sample

First rows

fec_not | semana | cod_pre | tip_ide_ | num_ide_ | edad_ | nacionali_ | nombre_nacionalidad | cod_dpto_o | cod_mun_o | area_ | ocupacion_ | tip_ss_ | cod_ase_ | estrato_ | gp_migrant | gp_gestan | sem_ges_ | gp_otros | cod_dpto_r | cod_mun_r | fec_con_ | pac_hos_ | fec_hos_ | fecha_nto_ | fec_aju_ | nit_upgd | version | tip_doc_rn | no_ide_rn | fecha_nac | edad_rn | sexo | peso_nacer | talla_nace | sem_gest | mult_embar | num_em_pre | num_hi_viv | niv_edu_ma | nom_upgd | ndep_proce | nmun_proce | ndep_resi | nmun_resi | nreg
02021-05-19196800100792CC3737277336170COLOMBIA6868939996SEPSS41NaN22NaN1686892021-05-1112021-05-11 00:00:001985-02-192021-05-199.000060e+09SIVIGILA - 2018 -18.2.0$0$1.0MS37372773-42021-05-1111F210045.0371341.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERSAN VICENTE DE CHUCURISANTANDERSAN VICENTE DE CHUCURI118
12021-01-2546800100701CC109871390430170COLOMBIA6819029996SEPSS411.022NaN16812021-01-252- -1990-10-242021-02-268.000842e+09SIVIGILA - 2018 - 18.2.0MS1098713904-12021-01-250M232044.0371111.0HOSPITAL LOCAL DEL NORTESANTANDERCIMITARRASANTANDERBUCARAMANGA38
22021-03-0486800100701CC109994052022170COLOMBIA6882039996SESS1331.022NaN16812021-02-2512021-02-25 00:00:001998-04-262021-04-168.000842e+09SIVIGILA - 2018 - 18.2.0MS1099940520-22021-02-257F232047.0391122.0HOSPITAL LOCAL DEL NORTESANTANDERTONASANTANDERBUCARAMANGA85
32021-01-1116800100792CC100025100420170COLOMBIA6857239996SESS024NaN22NaN1685722021-01-0712021-01-07 00:00:002000-08-192021-01-119.000060e+09SIVIGILA - 2018 -18.1.10$0$0.2MS1000251004-22021-01-074F228049.5371122.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERPUENTE NACIONALSANTANDERPUENTE NACIONAL1
42021-01-1116800100792CC1000251004-120170COLOMBIA6857239996SESS024NaN22NaN1685722021-01-0712021-01-07 00:00:002000-08-192021-01-119.000060e+09SIVIGILA - 2018 -18.1.10$0$0.2MS1000251004-12021-01-074F227045.5372122.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERPUENTE NACIONALSANTANDERPUENTE NACIONAL2
52021-01-1816800100792CC100523490924170COLOMBIA6830719996SEPSS372.02138.01683072021-01-0812021-01-08 00:00:001996-04-132021-01-189.000060e+09SIVIGILA - 2018 - 18.1.10MS10052349092021-01-0810M234548.0381012.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERGIRONSANTANDERGIRON4
62021-01-0816800100431CC119314950220170COLOMBIA6854719999SEPSS02NaN22NaN1685472021-01-082- -2000-07-312021-01-248.902088e+09SIVIGILA - 2018 -18.1.10$0$0.2MS1193149502-12021-01-0816M249047.0371012.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERPIEDECUESTASANTANDERPIEDECUESTA11
72021-01-0616827600717MSVEN3158303017862VENEZUELA68119996NNaN1.012NaN26812021-01-0612021-01-06 00:00:002003-07-172021-01-09NaNSIVIGILA - 2018 -18.1.10$0$0.2MS31583030022021-01-063M246046.0381122.0NaNSANTANDERBUCARAMANGASANTANDERBUCARAMANGA16
82021-03-12106800100792CC100513650524170COLOMBIA68119996SESS024NaN2137.016812021-03-1012021-03-11 00:00:001996-09-232021-03-129.000060e+09SIVIGILA - 2018 -18.2.0$0$1.0MS10051365052021-03-111M218547.0371022.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERBUCARAMANGASANTANDERBUCARAMANGA57
92021-03-09106800100431CC100548435328170COLOMBIA6827619999CEPS005NaN22NaN1682762021-03-0812021-03-09 00:00:001992-10-062021-03-148.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1005484353-12021-03-095F227049.0391122.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERFLORIDABLANCASANTANDERFLORIDABLANCA55

Last rows

fec_not semana cod_pre tip_ide_ num_ide_ edad_ nacionali_ nombre_nacionalidad cod_dpto_o cod_mun_o area_ ocupacion_ tip_ss_ cod_ase_ estrato_ gp_migrant gp_gestan sem_ges_ gp_otros cod_dpto_r cod_mun_r fec_con_ pac_hos_ fec_hos_ fecha_nto_ fec_aju_ nit_upgd version tip_doc_rn no_ide_rn fecha_nac edad_rn sexo peso_nacer talla_nace sem_gest mult_embar num_em_pre num_hi_viv niv_edu_ma nom_upgd ndep_proce nmun_proce ndep_resi nmun_resi nreg
1212021-02-2486800100431CC100533665523170COLOMBIA68119999CEPS002NaN22NaN16812021-02-2412021-02-24 00:00:001997-07-302021-02-278.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1005336655-12021-02-243M221047.0371012.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERBUCARAMANGASANTANDERBUCARAMANGA46
1222021-02-2686800100431CC100766596020170COLOMBIA68119999SESS133NaN22NaN16812021-02-2612021-02-26 00:00:002001-01-162021-02-278.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1007665960-12021-02-261M230045.0371012.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERBUCARAMANGASANTANDERBUCARAMANGA45
1232021-02-2286800100431CC109591478432170COLOMBIA6854719999CEPS010NaN22NaN1685472021-02-2212021-02-22 00:00:001988-09-242021-02-278.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1095914784-12021-02-225F248548.0381013.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERPIEDECUESTASANTANDERPIEDECUESTA44
1242021-02-2286800101157CC110238000424170COLOMBIA6854719999CEPS0463.02137.02685472021-02-2212021-02-22 00:00:001996-04-222021-02-238.902097e+09SIVIGILA - 2018 -18.2.0$0$1.0CN1644528732021-02-221M238048.0371013.0CLINICA CHICAMOCHA SASANTANDERPIEDECUESTASANTANDERPIEDECUESTA39
1252021-02-2286800100431CC110271702033170COLOMBIA6868919999SESS024NaN22NaN1686892021-02-2212021-02-22 00:00:001987-10-112021-02-278.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1102717020-12021-02-225F228047.0371121.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERSAN VICENTE DE CHUCURISANTANDERSAN VICENTE DE CHUCURI43
1262021-02-2586800100792CEVEN2735599421862VENEZUELA6854719996NNaN1.012NaN1685472021-02-2212021-02-22 00:00:001999-03-162021-02-259.000060e+09SIVIGILA - 2018 - 18.2.0MS1644520722021-02-223M236552.0371012.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERPIEDECUESTASANTANDERPIEDECUESTA41
1272021-02-2586800100792TI109807124515170COLOMBIA6844439997SESS0621.022NaN1684442021-02-1912021-02-19 00:00:002005-06-232021-02-259.000060e+09SIVIGILA - 2018 - 18.2.0MS1644520192021-02-223F247551.0371012.0HOSPITAL UNIVERSITARIO DE SANTANDERSANTANDERMATANZASANTANDERMATANZA40
1282021-03-0596827601666CC109595630021170COLOMBIA68119996SEPSS373.022NaN16812021-03-0512021-03-05 00:00:001999-03-302021-03-07NaNSIVIGILA - 2018 -18.2.0$0$1.0CN1644729392021-03-052F221546.0371212.0NaNSANTANDERBUCARAMANGASANTANDERBUCARAMANGA50
1292021-03-0596800101157CC3279180845170COLOMBIA6827619999CEPS0053.02138.01682762021-03-0312021-03-03 00:00:001975-03-252021-03-058.902097e+09SIVIGILA - 2018 -18.2.0$0$1.0CN1644537582021-03-032F218048.0381012.0CLINICA CHICAMOCHA SASANTANDERFLORIDABLANCASANTANDERFLORIDABLANCA48
1302021-03-0496800100431TI110263459317170COLOMBIA6882019999SESS133NaN22NaN16812021-03-0412021-03-04 00:00:002003-11-232021-03-068.902088e+09SIVIGILA - 2018 -18.2.0$0$1.0MS1102634593-12021-03-042F241548.0381013.0CLINICA MATERNO INFANTIL SAN LUIS SASANTANDERTONASANTANDERBUCARAMANGA47